We describe briefly the Palomar-Quest (PQ) digital synoptic sky survey, including its parameters, data processing, status, 
and plans. Exploration of the time domain is now the central scientific and technological focus of the survey. To this end, 
we have developed a real-time pipeline for detection of transient sources. We describe some of the early results, and lessons 
learned which may be useful for other, similar projects, and time-domain astronomy in general. Finally, we discuss some 
issues and challenges posed by the real-time analysis and scientific exploitation of massive data streams from modern 
synoptic sky surveys. 



1 A Brief Description of the PQ Survey 

The Palomar-Quest (PQ) digital synoptic sky survey is a 
collaborative project between groups at Yale University and 
Caltech (Co-PIs: C. Baltay and S.G. Djorgovski), with an 
extended network of collaborations with other groups 
worldwide, including Indiana U. (M. Gebhard et al.), NCSA 
(R. Brunner et al.), LBNL Nearby SN Factory (NSNF; S. 
Perlmutter et al.), INAOE (Puebla, Mexico; L. Carrasco, O. 
Lopez-Cruz et al.), EPFL (Switzerland; G. Meylan et al.), 
and Caltech/JPL (M. Brown et al.). The data are obtained 
at the Palomar Observatory's Samuel Oschin telescope (the 
48-inch Schmidt) using the QUEST-2 112-CCD, 161 Mpix 
camera (Baltay et al. 2007). Approximately 45% of the telescope 
time is used for the PQ survey. The survey started in the late 
summer of 2003, and will finish in late 2008. 

In the first phase of the survey, data were obtained in 
the drift scan mode in 4.6° wide strips of constant Dec, in 
the range -25° < δ < +25°, excluding the Galactic plane. 
The total area coverage is ~ 15,000 deg², with multiple 
passes, ranging from a few to about 25, and typically 5 - 
10 times, with time baselines ranging from hours to years. 
There are some thin-strip gaps in the coverage, due to a 
combination of inter-CCD gaps, bad CCDs, and a suboptimal 
dithering strategy. Typical area coverage rate is up to 
~ 500 deg²/night in 4 filters. The raw data rate is on average 
~ 70 GB per clear night. To date, about 25 TB of usable 
data have been collected in the drift scan mode. 

Data were obtained with two filter sets, Johnson UBRI 
and Gunn/SDSS rizz, recently changed to griz. Effective 
exposures are ≈ 150 sec / cos δ per pass. Typical estimated 
limiting magnitudes are r_lim ≈ 21.5, i_lim ≈ 20.5, z_lim ≈ 
19.5, R_lim ≈ 22, and I_lim ≈ 21 mag, depending on the 
seeing, lunar phase, etc. Coadding of ~ 8 passes reaches the 
depth of SDSS in the redder bands. Photometric calibrations 
are done independently at Yale and Caltech, mainly using 
the overlap region with SDSS. 
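
The exposure and coadding figures quoted above can be illustrated with a short numerical sketch. It assumes identical, sky-background-limited passes, so that the signal-to-noise ratio grows as the square root of the number of passes; that assumption, and the function names, are ours for illustration and are not part of the survey software.

```python
import math

def effective_exposure_sec(dec_deg, equatorial_exposure_sec=150.0):
    """Drift-scan exposure per pass at declination dec_deg (in degrees).

    Uses the 150 sec / cos(dec) scaling quoted in the text.
    """
    return equatorial_exposure_sec / math.cos(math.radians(dec_deg))

def coadd_depth_gain_mag(n_passes):
    """Depth gain in magnitudes from coadding n_passes equal passes.

    Assumption: background-limited data, so S/N scales as sqrt(n_passes).
    """
    return 2.5 * math.log10(math.sqrt(n_passes))

exposure_at_equator = effective_exposure_sec(0.0)
exposure_at_dec25 = effective_exposure_sec(25.0)
gain_8_passes = coadd_depth_gain_mag(8)
```

Under these assumptions the exposure per pass rises from 150 sec on the equator to about 166 sec at the edge of the surveyed Dec range, and a coadd of 8 passes gains roughly 1.1 mag over a single pass.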

In the second phase of the survey, which started in the 
spring of 2007, data are obtained in the traditional point- 
and-track mode, in a single, wide-band red filter (RG610), 
with ~ 10% of the time in the drift scan mode. The coverage 
and the cadence are optimized for the nearby supernova 
search, in collaboration with the LBNL NSNF group, and a 
search for dwarf planets, in collaboration with M. Brown. 

Data are processed with several different pipelines, optimized 
for different scientific goals. These include the Yale 
pipeline (Andrews et al. 2007), which performs PSF fitting 
and was designed for a search for gravitationally lensed 
quasars; the Caltech data cleaning pipeline, used to remove 
numerous instrumental artifacts present in the data; the Caltech 
real-time pipeline, used for real-time detection of transient 
events, as described below; the LBNL NSNF pipeline, 
based on image subtraction and designed for detection of 
nearby SNe; and a pipeline for optimal coadding of images 
and detection of sources in them, now being developed at 
Caltech. Images and resulting catalogs are stored in multiple 
locations, using a variety of databases. 

PQ is the first major digital sky survey fully designed 
and implemented in the Virtual Observatory (VO) era, and 
it uses VO standards and protocols throughout. Public data 
releases will also be made through VO-type interfaces. The 
first public data release is imminent, pending the completion 
of various data quality control and assessment tests. 

The survey is feeding multiple scientific goals and pro- 
jects. The initial motivation was a search for > 10^ QSOs, 
using colors and variability, in order to discover > 100 
strong gravitational lenses, and use them to constrain cos- 
mology and/or history of mass assembly. Another project 
was a search for high-z QSOs, to be used as probes of reionization 
and early structure formation. Both of them are now 
finally starting to yield results; the progress was slow due to 
numerous problems with the data, all of which have been 
solved, and will be documented in detail elsewhere. Our 
principal scientific focus now is exploration of the time domain, 
as described below. 

Our main public outreach effort to date has been the creation 
of the Griffith Observatory's "Big Picture", and the associated 
website, http://bigpicture.caltech.edu. This exhibit 
will be seen by millions of visitors, serving multiple educational 
roles in the years to come. 

2 PQ Exploration of the Time Domain: 
Some Preliminary Results 

With a data set covering nearly 40% of the entire sky, with 
multiple passes reaching 21 mag each, and time baselines 
ranging from minutes (between different CCDs) to hours 
(repeated scans in the same night), days (within the same 
lunation), months, and years, and (using the cross-matches 
to DPOSS and SDSS catalogs) up to decades, PQ is in a 
unique position to explore the time-variable sky in a systematic 
fashion. For some early reports, see Graham et al. (2005), 
Mahabal et al. (2004, 2005), or Djorgovski et al. (2006). 

One major effort is a search for nearby (z ≲ 0.1) SNe 
Ia, to be used as the low-z calibration of the Hubble diagram. 
This project is led by the Yale group in collaboration 
with the LBNL NSNF. To date, this effort has found a total 
of about 500 SNe, about half of which were spectroscopically 
confirmed, and among them about 70 Type Ia SNe with 
10 or more spectra taken; as well as a plethora of other SNe 
(including some peculiar ones) and transients. All are published 
in IAU Circulars, CBETs, and ATels. The work uses 
an image subtraction technique, in order to remove the light 
of the well-detected host galaxies. The Caltech real-time pipeline 
is now also starting to detect SNe, using a search for transients 
in the catalog domain. 

We are now using the archives of our data to study sys- 
tematically the variability of QSOs, and especially Blazars. 
Some examples are shown in Fig. 1. The main goal is to 
devise an algorithm based on colors and variability alone 
to define a purely optically selected sample of Blazars, and 
thus check on the selection effects in the traditional radio 
and X-ray approaches. These sources may be the main contributors 
to the extragalactic γ-ray background, a subject 
of considerable interest with the upcoming launch of the 
GLAST mission. They are also implicated as sources of 
TeV-scale (and presumably even more energetic) γ-rays and 
ultra-high energy cosmic rays (UHECR). These cosmic ac- 
celerators can reach energies several orders of magnitude 
higher than any predictable terrestrial accelerators. Their 
census and detailed studies are thus of a considerable and 
growing interest. 

Fig. 1 Examples of 3 known Blazars, as seen in the PQ 
data. The top row shows them in a relatively high state, the 
bottom row in a relatively low state. 

Our exploration of the archival PQ data has yielded a 
large number of transients, operationally defined as PSF-like 
sources detected in only one epoch, with no detectable 
apparent motion between different CCDs in a single pass. 
Subsequent studies have revealed counterparts for some of 
them in deeper, coadded images. We believe that many of 
them are probably asteroids caught near the stationary point 
(see below). However, this has underscored the need to detect 
and follow transients in real or near-real time, in order 
to determine their physical nature. 

We have thus developed a real-time pipeline, which is 
now operational. The pipeline performs the standard removal of 
instrumental signatures, pushes the data through the Caltech 
cleaning pipeline, detects and measures sources, performs 
astrometry, compares the new catalogs to those from 
the previous passes, finds newly detected sources, applies 
a number of software filters to eliminate the residual 
instrumental artifacts, known asteroids or variables, and moving 
objects (uncatalogued asteroids), produces cutout images 
and webpages for the candidate transients, and publishes 
them using the VOEvent protocols and on the VOEN website, 
http://voeventnet.caltech.edu/feeds/PQ_OT.shtml 
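
The catalog-domain step of comparing new catalogs to those from previous passes can be sketched as a positional cross-match. The following is a minimal illustration only, not the PQ pipeline code: the function names, the brute-force search, and the 2 arcsec match radius are all assumptions made for the example.

```python
import math

def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation via the haversine formula; inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0

def find_new_sources(new_catalog, baseline_catalog, radius_arcsec=2.0):
    """Return (ra, dec) detections with no baseline counterpart within the radius.

    Brute-force O(N*M) search, adequate only for a small illustration;
    the match radius is a hypothetical value, not a PQ parameter.
    """
    candidates = []
    for ra, dec in new_catalog:
        matched = any(
            angular_sep_arcsec(ra, dec, bra, bdec) <= radius_arcsec
            for bra, bdec in baseline_catalog
        )
        if not matched:
            candidates.append((ra, dec))
    return candidates

# Made-up positions in degrees: the first new detection coincides with a
# baseline source, the second has no counterpart and becomes a candidate.
baseline = [(150.0000, 2.0000), (150.0100, 2.0050)]
tonight = [(150.0001, 2.0001), (150.0200, 2.0100)]
candidates = find_new_sources(tonight, baseline)
```

A production system would replace the double loop with a spatial index, and the surviving candidates would still have to pass the artifact, asteroid, and variable-source filters listed above.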

We typically do a 4-hour long scan, then re-scan the 
same area again, with the real-time pipeline running. In a 
typical half-night scan, we may detect a couple of million 
sources, and about a thousand potential transients. Removal 
of residual instrumental artifacts leaves a few hundred genuine 
detections, nearly all of which are asteroids; typically 
only about half of these are previously catalogued, and 
the rest are largely removed after the second scan. The number 
of VOEvents generated has been evolving as the filtering 
procedures were improved. Over the past year or so, nearly 
4800 events have been submitted, with an average rate of ~ 
200 per night. About 85% of these were immediately classified 
as asteroids, and the majority of the remaining ones 
are asteroids as well. Finally, there are only a few (< 10/night) 
apparently genuine astrophysical transients left. Spectroscopic 
follow-up observations to date show that they are a mixture 
of SNe, AGN, and probable flaring M dwarfs, while the rest are of 
as yet unknown nature. Some are re-discovered on different 
nights. 
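
The event statistics quoted above imply a simple filtering funnel, worked through below. The input numbers are taken from the text; the variable names and the derived quantities are ours, for illustration only.

```python
# Numbers quoted in the text.
events_submitted = 4800        # VOEvents submitted over the past year or so
events_per_night = 200         # average submission rate
asteroid_fraction = 0.85       # immediately classified as asteroids
genuine_per_night_max = 10     # upper bound on apparently genuine transients

# Derived quantities.
nights_equivalent = events_submitted / events_per_night
non_asteroid_per_night = events_per_night * (1.0 - asteroid_fraction)
genuine_fraction_max = genuine_per_night_max / events_per_night
```

At the average rate, the submitted events correspond to about 24 nights of scanning; about 30 events per night survive the immediate asteroid classification, and fewer than 5% of the submitted events are apparently genuine astrophysical transients.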

Fig. 2 An example of a transient detected with our real-time 
pipeline, PQT 070519:143304+150707; see Drake et 
al., ATel 1083. The top row shows the detection images in the g 
and r bands; the bottom row shows the comparison baseline 
images. The source faded slowly, but got redder rapidly; it 
may be a rare type of SN. 

3 Some Lessons Learned 

Combining the current PQ experiences with the older work 
with DPOSS (see, e.g., Mahabal et al. 2005), we estimate 
that in a single-pass snapshot survey there are ~ 10^-2 astrophysical 
transients/deg² down to ~ 20 mag at high Galactic 
latitudes. Many of them are known, highly variable types of 
objects, where the "low state" is below the detection limit of the 
baseline data, with variable stars of different kinds dominating 
on the short time scales (~ minutes to months), and 
AGN (mainly Blazars and OVVs) dominating on the longer 
time scales (years and longer). Some are a variety of stellar 
explosions. Some may be as-yet unknown types of objects 
and phenomena, but real-time spectroscopic and other 
follow-up is necessary in order to discover them. 

We find that the slow-moving asteroids are a principal 
contaminant for optical surveys; there are ~ 1 - 3 of them 
per deg² down to ~ 21 mag, depending very much on the 
Ecliptic latitude; i.e., > 100 asteroids for each astrophysical 
transient. A joint analysis for moving and variable objects is 
necessary, and any type of synoptic sky survey data stream 
can feed both scientific domains simultaneously. Improving 
the existing catalogs of asteroids is an urgent task. At 
least two epochs are needed in order to eliminate previously 
unknown asteroids in any synoptic survey, and their baseline 
will define the effective time resolution of any transient 
search (we also note that at least 3 properly spaced epochs 
are needed to compute even a rough orbit). 
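
The two-epoch requirement can be illustrated with a minimal sketch: a candidate that reappears at the same position in the second scan is stationary, one that has shifted is a moving object, and one without a second detection remains unconfirmed. The 1 arcsec astrometric tolerance, the function names, and the assumption that detections from the two scans have already been paired are all hypothetical choices made for the example.

```python
import math

def displacement_arcsec(pos1, pos2):
    """Small-angle separation between two (ra, dec) positions given in degrees."""
    (ra1, dec1), (ra2, dec2) = pos1, pos2
    mean_dec = math.radians(0.5 * (dec1 + dec2))
    d_ra = (ra2 - ra1) * math.cos(mean_dec)
    d_dec = dec2 - dec1
    return math.hypot(d_ra, d_dec) * 3600.0

def classify_two_epochs(pos1, pos2, tol_arcsec=1.0):
    """Label a candidate from its positions in two scans of the same area.

    pos2 is None when nothing was detected nearby in the second scan.
    """
    if pos2 is None:
        return "unconfirmed"
    if displacement_arcsec(pos1, pos2) > tol_arcsec:
        return "moving"
    return "stationary"

def slowest_detectable_rate(tol_arcsec, baseline_hours):
    """Slowest apparent motion (arcsec/hour) distinguishable from no motion."""
    return tol_arcsec / baseline_hours

label_fixed = classify_two_epochs((200.0, 10.0), (200.0, 10.0))
label_moved = classify_two_epochs((200.0, 10.0), (200.002, 10.0))
rate_limit = slowest_detectable_rate(1.0, 4.0)
```

With these illustrative values, a 4-hour baseline between scans separates moving objects from stationary ones down to 0.25 arcsec/hour; objects moving more slowly, such as asteroids near the stationary point, would need a longer baseline, which in turn coarsens the time resolution of the transient search.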

The quality of the baseline or fiducial sky against which 
current observations are compared is a key issue. It must 
be deep, clean, complete, and wavelength-matched. Generating 
a standard, dynamically evolving, annotated, multi-wavelength 
baseline sky may be a good community (VO) 
project; we are developing a prototype from PQ and other 
publicly available panoramic imaging data sets. 

Achieving a high completeness (few real transients 
missed) and a low contamination (few false alarms) is a 
huge challenge. Interesting sources are discovered as outliers 
in some parameter space; problems with the data also 
generate outliers in some parameter space. In a large data 
set, even the most unlikely things will happen, and most of them are 
bad. Robust and reliable data cleaning is a key requirement. 
This is hard to do in a cutting-edge software system. 

Data systems (pipelines, archives, and analysis) and op- 
erational procedures for synoptic sky surveys are subject to 
a substantial tension between static and dynamic compo- 
nents, including both real-time and subsequent (non-time- 
critical) analysis and distribution, data ingestion, database 
updating and recomputing, etc. This has implications both 
for survey strategies and system architecture design. 

Another key challenge is an automated classification of 
events for prioritized follow-up, as discussed by Mahabal 
et al. and Bloom et al. elsewhere in this volume. This will 
certainly require use of machine learning tools, as described 
by Vestrand et al. in this volume. 

All of these challenges will grow much sharper, as the 
data volume and data flux increase dramatically in upcoming 
synoptic sky surveys. We are now dealing with data 
streams of the order of 0.1 TB/night, and ~ 10 transients/night. 
On a time scale of 1 - 5 years, this will increase to ~ 1 
TB/night and ~ 10'' transients/night (e.g., PanSTARRS), 
and on a time scale of 5 - 10 years, this will increase to 
~ 20 TB/night and ~ 10^ - 10^ transients/night (LSST). 
Development and testing of software, methodologies, and 
operational and follow-up procedures is an urgent task, in 
which surveys such as PQ can play an important role. 